refactor(eval): split progress reporter into strategy-based reporting package #1711
Open
Chibionos wants to merge 1 commit into
… package

Reimplements the design from #1040 (closed as stale) on current main: the 1475-line _progress_reporter.py monolith threaded is_coded booleans through every method to switch between the legacy and coded StudioWeb evaluation APIs. The differences (endpoint routing, GUID conversion, eval snapshot shape, result collection format, update payload keys) now live in strategy classes under _cli/_evals/_reporting/:

- _strategy_protocol.py: EvalReportingStrategy protocol
- _legacy_strategy.py: GUID ids, assertionRuns, no path segment
- _coded_strategy.py: string ids, evaluatorRuns, coded/ segment
- _reporter.py: event handling, HTTP plumbing, per-execution state
- _models.py / _utils.py / _strategies.py: shared pieces + selection

_progress_reporter.py remains as a compatibility shim. Behavior is unchanged: the existing 61-test progress reporter suite passes without modification; 24 new strategy unit tests added.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Contributor
Pull request overview
Refactors StudioWeb evaluation progress reporting by extracting the legacy-vs-coded API differences into a strategy-based _reporting/ package, while keeping _progress_reporter.py as a compatibility shim so existing CLI imports continue to work.
Changes:
- Adds strategy protocol + legacy/coded strategy implementations and strategy selection helpers under uipath/_cli/_evals/_reporting/.
- Moves the StudioWebProgressReporter implementation into _reporting/_reporter.py and re-exports the previous public surface from _progress_reporter.py.
- Adds unit tests for strategy behavior and bumps the uipath package version to 2.10.83.
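
To make the strategy split concrete, here is a minimal sketch of how endpoint routing and payload keys could hang off strategy objects instead of an is_coded flag. Only the key names (assertionRuns, evaluatorScores, evaluatorRuns, scores), the coded/ segment, and the names EvalReportingStrategy and strategy_for come from this PR; the attribute names, the boolean parameter of strategy_for, and the build_update and route helpers are assumptions made for illustration and may not match the real code.

```python
from typing import Any, Protocol


class EvalReportingStrategy(Protocol):
    """Illustrative protocol; member names are assumptions, not the PR's API."""

    endpoint_suffix: str  # extra path segment for the API route
    runs_key: str         # payload key holding per-evaluator runs
    scores_key: str       # payload key holding per-evaluator scores


class LegacyStrategy:
    endpoint_suffix = ""  # legacy API: no extra path segment
    runs_key = "assertionRuns"
    scores_key = "evaluatorScores"


class CodedStrategy:
    endpoint_suffix = "coded/"
    runs_key = "evaluatorRuns"
    scores_key = "scores"


# The strategies hold no state, so one shared instance of each is enough.
_LEGACY = LegacyStrategy()
_CODED = CodedStrategy()


def strategy_for(is_coded: bool) -> EvalReportingStrategy:
    # Assumed signature: the real helper may inspect evaluators instead.
    return _CODED if is_coded else _LEGACY


def build_update(
    strategy: EvalReportingStrategy, runs: list[Any], scores: list[Any]
) -> dict[str, Any]:
    # Hypothetical helper: the strategy alone decides the payload keys.
    return {strategy.runs_key: runs, strategy.scores_key: scores}


def route(strategy: EvalReportingStrategy, resource: str) -> str:
    # Hypothetical helper: where the segment sits in the real URL is assumed.
    return f"{strategy.endpoint_suffix}{resource}"
```

With this shape, the reporter picks a strategy once and never branches on a boolean again; a new API variant would mean one new class, not edits to every method.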
Reviewed changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| packages/uipath/uv.lock | Bumps locked uipath version to 2.10.83. |
| packages/uipath/pyproject.toml | Bumps package version to 2.10.83. |
| packages/uipath/tests/cli/eval/test_reporting_strategies.py | Adds unit tests covering strategy selection, ID conversion, snapshot/payload shapes. |
| packages/uipath/src/uipath/_cli/_evals/_reporting/__init__.py | Defines the new reporting package exports (strategies, models, reporter, helpers). |
| packages/uipath/src/uipath/_cli/_evals/_reporting/_strategy_protocol.py | Introduces EvalReportingStrategy protocol describing API-shaping responsibilities. |
| packages/uipath/src/uipath/_cli/_evals/_reporting/_strategies.py | Adds singleton strategies + strategy_for / is_coded_evaluators selection utilities. |
| packages/uipath/src/uipath/_cli/_evals/_reporting/_coded_strategy.py | Implements coded API behavior (string IDs, coded/ routing, evaluatorRuns/scores). |
| packages/uipath/src/uipath/_cli/_evals/_reporting/_legacy_strategy.py | Implements legacy API behavior (GUID conversion, assertionRuns/evaluatorScores). |
| packages/uipath/src/uipath/_cli/_evals/_reporting/_models.py | Extracts shared models (status enum, progress item, agent snapshot). |
| packages/uipath/src/uipath/_cli/_evals/_reporting/_utils.py | Extracts shared helpers (error decorator, deterministic GUID, usage extraction, env parsing). |
| packages/uipath/src/uipath/_cli/_evals/_reporting/_reporter.py | New StudioWebProgressReporter implementation using the strategy package. |
| packages/uipath/src/uipath/_cli/_evals/_progress_reporter.py | Compatibility shim re-exporting the prior public API from _reporting. |
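
The table mentions a deterministic GUID helper in _utils.py, which the legacy strategy needs because that API expects GUID ids. The sketch below shows one common way to get such behavior with the standard library's uuid5; the function name to_guid, the namespace choice, and the pass-through of already-valid GUIDs are assumptions, and the PR's helper may use a different scheme.

```python
import uuid

# Hypothetical namespace: any fixed UUID works as long as it never changes.
_NAMESPACE = uuid.NAMESPACE_DNS


def to_guid(raw_id: str) -> str:
    """Normalize raw_id if it is already a GUID, else derive a stable one."""
    try:
        return str(uuid.UUID(raw_id))
    except ValueError:
        # uuid5 hashes (namespace, name), so equal inputs give equal GUIDs.
        return str(uuid.uuid5(_NAMESPACE, raw_id))
```

Because the mapping is a pure function of the input string, repeated runs and resumed runs would report the same id for the same evaluator.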
Comment on lines +562 to +571

    # Check if we already have an eval_run_id cached
    existing_eval_run_id = self.eval_run_ids.get(payload.execution_id)

    if existing_eval_run_id:
        # Already have eval_run_id (from previous fetch or creation)
        logger.info(
            f"Using cached eval_run_id={existing_eval_run_id} for execution_id={payload.execution_id} "
            f"(skipping backend fetch/create)"
        )
        return
Comment on lines +1121 to +1125

    Args:
        eval_set_id: The ID of the eval set
        eval_set_run_id: The ID of the eval set run
        evaluation_id: Optional evaluation ID to filter for a specific eval run
        is_coded: Whether this is a coded evaluation (vs legacy)
Summary
Reimplements the strategy-pattern reporting refactor from #1040 (closed as stale: it predated the monorepo migration and was 540 commits behind) on current main. Stacked on #1710 (uipath-eval package extraction); will be retargeted to main once that merges.

The 1475-line _progress_reporter.py monolith threaded is_coded booleans through every method to switch between the legacy and coded StudioWeb evaluation APIs. Those differences now live in dedicated strategy classes.

Design

packages/uipath/src/uipath/_cli/_evals/_reporting/:

| File | Description |
|---|---|
| _strategy_protocol.py | EvalReportingStrategy protocol: endpoint suffix, ID conversion, eval snapshot shape, result collection, update payload |
| _legacy_strategy.py | assertionRuns + evaluatorScores, no path segment |
| _coded_strategy.py | evaluatorRuns + scores, coded/ path segment |
| _strategies.py | Selection (strategy_for, is_coded_evaluators) |
| _reporter.py | StudioWebProgressReporter: event handling, HTTP plumbing, per-execution state, resume flow |
| _models.py / _utils.py | Shared pieces |

_progress_reporter.py remains as a compatibility shim re-exporting the public surface, so existing importers (cli_eval.py, tests, anything external reaching into _cli) are unaffected.

Mixed coded/legacy eval sets keep their existing behavior: results are collected by both strategies (each skips evaluators it doesn't own) and the active strategy shapes the update payload.
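
The mixed coded/legacy behavior described above can be sketched as follows. This is one possible reading of that sentence, not the PR's code: the Result and Strategy classes, the collect method, the set-based ownership check, and the single-key payload are all hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass
class Result:
    evaluator_id: str
    score: float


class Strategy:
    """Hypothetical strategy that owns a fixed set of evaluator ids."""

    def __init__(self, runs_key: str, owned_ids: set[str]) -> None:
        self.runs_key = runs_key
        self.owned_ids = owned_ids

    def collect(self, results: list[Result]) -> list[dict]:
        # Each strategy skips evaluators it does not own.
        return [
            {"evaluatorId": r.evaluator_id, "score": r.score}
            for r in results
            if r.evaluator_id in self.owned_ids
        ]


def build_update(
    results: list[Result], strategies: list[Strategy], active: Strategy
) -> dict[str, list[dict]]:
    runs: list[dict] = []
    for strategy in strategies:
        # Every strategy contributes the results it owns.
        runs.extend(strategy.collect(results))
    # Only the active strategy decides the payload key.
    return {active.runs_key: runs}
```

A short usage example under the same assumptions:

```python
legacy = Strategy("assertionRuns", {"legacy-1"})
coded = Strategy("evaluatorRuns", {"coded-1"})
results = [Result("legacy-1", 0.5), Result("coded-1", 1.0)]
payload = build_update(results, [legacy, coded], active=coded)
```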
Behavior preservation
- The existing 61-test test_progress_reporter.py suite passes without a single modification: payloads, endpoints, GUID conversion, resume flow, and env handling are byte-identical.
- … (src + tests), and the custom httpx linter are clean.

🤖 Generated with Claude Code